
    Faster subsequence recognition in compressed strings

    Computation on compressed strings is one of the key approaches to processing massive data sets. We consider local subsequence recognition problems on strings compressed by straight-line programs (SLP), which are closely related to Lempel-Ziv compression. For an SLP-compressed text of length $\bar m$ and an uncompressed pattern of length $n$, Cégielski et al. gave an algorithm for local subsequence recognition running in time $O(\bar m n^2 \log n)$. We improve the running time to $O(\bar m n^{1.5})$. Our algorithm can also be used to compute the longest common subsequence between a compressed text and an uncompressed pattern in time $O(\bar m n^{1.5})$; the same problem with a compressed pattern is known to be NP-hard.
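    The uncompressed baseline these running times improve on is the classic quadratic LCS dynamic program. A minimal sketch of that baseline (not the SLP-based algorithm of the abstract, which works on the grammar directly):

    ```python
    def lcs_length(text: str, pattern: str) -> int:
        """Classic O(m*n) dynamic program for the longest common
        subsequence of an uncompressed text and pattern."""
        m, n = len(text), len(pattern)
        # dp[j] holds LCS(text[:i], pattern[:j]), rolled over rows i
        dp = [0] * (n + 1)
        for i in range(1, m + 1):
            prev_diag = 0  # dp value at (i-1, j-1)
            for j in range(1, n + 1):
                prev_row = dp[j]  # dp value at (i-1, j)
                if text[i - 1] == pattern[j - 1]:
                    dp[j] = prev_diag + 1
                else:
                    dp[j] = max(dp[j], dp[j - 1])
                prev_diag = prev_row
        return dp[n]
    ```

    On an SLP-compressed text of uncompressed length $N$, running this directly costs $O(Nn)$, which the compressed algorithms above avoid.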

    Design Principles for Sparse Matrix Multiplication on the GPU

    We implement two novel algorithms for sparse-matrix dense-matrix multiplication (SpMM) on the GPU. Our algorithms expect the sparse input in the popular compressed-sparse-row (CSR) format and thus do not require expensive format conversion. While previous SpMM work concentrates on thread-level parallelism, we additionally focus on latency hiding with instruction-level parallelism and load balancing. We show, both theoretically and experimentally, that the proposed SpMM is a better fit for the GPU than previous approaches. We identify a key memory access pattern that allows efficient access into both input and output matrices and that is crucial to excellent SpMM performance. By combining these two ingredients, (i) merge-based load balancing and (ii) row-major coalesced memory access, we demonstrate a 4.1x peak speedup and a 31.7% geomean speedup over state-of-the-art SpMM implementations on real-world datasets.
    Comment: 16 pages, 7 figures, International European Conference on Parallel and Distributed Computing (Euro-Par) 201
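    As a point of reference, a sequential sketch of CSR-based SpMM, illustrating the row-major access into the dense operand that the abstract highlights (function and variable names are illustrative, not from the paper):

    ```python
    def spmm_csr(indptr, indices, data, B, n_cols_B):
        """Sparse(CSR) x dense multiply: C = A @ B.
        A is given by CSR arrays (indptr, indices, data);
        B is a dense row-major matrix given as a list of rows."""
        n_rows = len(indptr) - 1
        C = [[0.0] * n_cols_B for _ in range(n_rows)]
        for i in range(n_rows):
            # nonzeros of row i of A live in data[indptr[i]:indptr[i+1]]
            for p in range(indptr[i], indptr[i + 1]):
                a = data[p]          # value A[i, k]
                row_B = B[indices[p]]  # row k of B
                # row-major sweep: consecutive j touches consecutive
                # memory, the coalescing-friendly pattern on a GPU
                for j in range(n_cols_B):
                    C[i][j] += a * row_B[j]
        return C
    ```

    The paper's GPU kernels additionally split the total nonzero work evenly across threads (merge-based load balancing), which this sequential sketch does not model.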

    Compressed Subsequence Matching and Packed Tree Coloring

    We present a new algorithm for subsequence matching in grammar compressed strings. Given a grammar of size $n$ compressing a string of size $N$ and a pattern string of size $m$ over an alphabet of size $\sigma$, our algorithm uses $O(n + n\sigma/w)$ space and $O(n + n\sigma/w + m \log N \log w \cdot occ)$ or $O(n + (n\sigma/w) \log w + m \log N \cdot occ)$ time. Here $w$ is the word size and $occ$ is the number of occurrences of the pattern. Our algorithm uses less space than previous algorithms and is also faster for $occ = o(n/\log N)$ occurrences. The algorithm uses a new data structure that allows us to efficiently find the next occurrence of a given character after a given position in a compressed string. This data structure in turn is based on a new data structure for the tree color problem, where the node colors are packed in bit strings.
    Comment: To appear at CPM '1
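    The "next occurrence of a given character after a given position" primitive is easy to illustrate on an uncompressed string; the naive table below takes $O(|s| \cdot \sigma)$ space and is only an analogue of the paper's compressed-string structure:

    ```python
    def build_next_occurrence(s, alphabet):
        """table[i][c] = smallest j >= i with s[j] == c, or len(s) if none.
        Naive uncompressed analogue of the paper's data structure."""
        n = len(s)
        nxt = {c: n for c in alphabet}
        table = [None] * (n + 1)
        table[n] = dict(nxt)
        for i in range(n - 1, -1, -1):
            nxt = dict(nxt)   # copy, then record this position
            nxt[s[i]] = i
            table[i] = nxt
        return table

    def match_subsequence(table, pattern, n):
        """Decide whether pattern occurs as a subsequence of s by
        repeatedly jumping to the next occurrence of each character."""
        pos = 0
        for c in pattern:
            j = table[pos][c]
            if j >= n:
                return False
            pos = j + 1
        return True
    ```

    Counting or reporting all occurrences, as in the abstract's $occ$-sensitive bounds, requires more machinery than this single greedy scan.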

    The design and analysis of bulk-synchronous parallel algorithms

    SIGLE. Available from British Library Document Supply Centre, DSC:D206326 / BLDSC (British Library Document Supply Centre). GB, United Kingdom.

    Efficient representation and parallel computation of string-substring longest common subsequences

    Given two strings a, b of length m, n respectively, the string-substring longest common subsequence (SS-LCS) problem consists in computing the length of the longest common subsequence of a and every substring of b. An explicit representation of the output lengths is of size Θ(n^2). We show that the output can be represented implicitly by a set of n two-dimensional integer points, where individual output lengths are obtained by dominance counting queries. This leads to a data structure of size O(n), which allows an individual output length to be queried in time O(log n / log log n), using a recent result by JaJa, Mortensen and Shi. The currently best sequential SS-LCS algorithm by Alves et al. can be adapted to produce the output in the above geometric representation. We also develop a new parallel SS-LCS algorithm that runs on a p-processor coarse-grained computer in O(mn/p) local computation, O(n log p) communication, O(log p) barrier synchronisations, and O(n) memory per processor, producing the output in the above geometric representation. Compared to previously known results, our approach presents a substantial improvement in algorithm functionality, output representation efficiency, communication efficiency and/or memory efficiency.
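    The query primitive underlying the implicit representation is dominance counting over the n stored points. A brute-force sketch of that primitive (the JaJa-Mortensen-Shi structure answers the same query in O(log n / log log n) after preprocessing; the exact mapping from counts to LCS lengths is defined in the paper and not reproduced here):

    ```python
    def dominance_count(points, x, y):
        """Count stored points (px, py) dominated by the query (x, y),
        i.e. with px <= x and py <= y. Brute force, O(n) per query."""
        return sum(1 for px, py in points if px <= x and py <= y)
    ```

    With the points precomputed once in O(n) space, every one of the Θ(n^2) output lengths is recoverable on demand, which is the source of the space saving over the explicit table.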

    Minimum-weight double-tree shortcutting for Metric TSP: Bounding the approximation ratio

    The Metric Traveling Salesman Problem (TSP) is a classical NP-hard optimization problem. The double-tree shortcutting method for Metric TSP yields an exponentially-sized space of TSP tours, each of which approximates the optimal solution within at most a factor of 2. We consider the problem of finding among these tours the one that gives the closest approximation, i.e. the minimum-weight double-tree shortcutting. Previously, we gave an efficient algorithm for this problem, and carried out its experimental analysis. In this paper, we address the related question of the worst-case approximation ratio for the minimum-weight double-tree shortcutting method. In particular, we give lower bounds on the approximation ratio in some specific metric spaces: the ratio of 2 in the discrete shortest path metric, 1.622 in the planar Euclidean metric, and 1.666 in the planar Minkowski metric. The only known upper bound is 2, which holds trivially in any metric space. We conjecture that for the Euclidean and Minkowski metrics, the upper bound can be improved to match our lower bounds.
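    A minimal sketch of producing *one* double-tree shortcutting tour for planar Euclidean points: build a minimum spanning tree, double its edges, and shortcut repeated vertices along a DFS order. This yields the classical 2-approximation; the paper studies selecting the minimum-weight tour among all such shortcuttings, which this sketch does not attempt:

    ```python
    import math

    def double_tree_tour(points):
        """One double-tree shortcutting tour: MST (Prim) + DFS order
        of the doubled tree with repeated vertices shortcut."""
        n = len(points)
        def dist(i, j):
            return math.dist(points[i], points[j])
        # Prim's MST, O(n^3) for brevity
        in_tree = [False] * n
        in_tree[0] = True
        adj = {i: [] for i in range(n)}
        for _ in range(n - 1):
            _, i, j = min((dist(i, j), i, j)
                          for i in range(n) if in_tree[i]
                          for j in range(n) if not in_tree[j])
            adj[i].append(j)
            adj[j].append(i)
            in_tree[j] = True
        # DFS order of the doubled tree = Euler tour with shortcutting:
        # each vertex is kept only at its first appearance
        tour, seen, stack = [], set(), [0]
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                tour.append(v)
                stack.extend(adj[v])
        return tour
    ```

    Different Euler tours of the doubled MST (e.g. different child orders in the DFS) give different shortcut tours; the minimum-weight variant of the problem asks for the best one over that exponential space.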